Random Sampling in an Age of Automation: Minimizing Expenditures through Balanced Collection and Annotation
Abstract
Methods for automated collection and annotation are changing the cost structures of sample surveys across a wide range of applications. Digital samples, such as images or audio recordings, can be collected rapidly and annotated by computer programs or crowd workers. We consider the problem of estimating a population mean under these new cost structures and propose a Hybrid-Offset sampling design. To minimize total sampling expenditures, this design uses two annotators: a primary, which is accurate but costly (e.g., a human expert), and an auxiliary, which is noisy but cheap (e.g., a computer program). Our analysis gives necessary conditions for the Hybrid-Offset design and specifies optimal sample sizes for both annotators. Simulations on data from a coral reef survey program indicate that the Hybrid-Offset design outperforms several alternative sampling designs. In particular, sampling expenditures are reduced by 50% compared to the Conventional design currently deployed by the coral ecologists.
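The two-annotator idea can be illustrated with a minimal sketch. It assumes a simple difference-style correction: the auxiliary annotator's bias is estimated on the units that both annotators labeled, then subtracted from the mean of all auxiliary labels. This is not necessarily the exact estimator analyzed in the paper. The function name `offset_estimate` and all simulation parameters (population size, bias, noise levels, subsample size) are hypothetical and chosen only for illustration.

```python
import random
import statistics


def offset_estimate(primary_labels, auxiliary_on_primary, auxiliary_on_rest):
    """Difference-style estimate of a population mean (illustrative only).

    primary_labels:       accurate labels for the m units the primary annotated
    auxiliary_on_primary: cheap labels for those same m units
    auxiliary_on_rest:    cheap labels for units only the auxiliary annotated
    """
    # Estimate the auxiliary annotator's bias on the doubly annotated units.
    diffs = [a - p for a, p in zip(auxiliary_on_primary, primary_labels)]
    offset = statistics.fmean(diffs)
    # Mean of all auxiliary labels, corrected by the estimated bias.
    all_aux = list(auxiliary_on_primary) + list(auxiliary_on_rest)
    return statistics.fmean(all_aux) - offset


# Simulated data; every number below is an assumption, not a value from the paper.
rng = random.Random(0)
true_values = [rng.gauss(0.30, 0.10) for _ in range(5000)]         # accurate value per sample
aux_values = [v + 0.05 + rng.gauss(0, 0.08) for v in true_values]  # biased, noisy labels

m = 200  # units sent to the costly primary annotator
estimate = offset_estimate(true_values[:m], aux_values[:m], aux_values[m:])
print(f"corrected estimate: {estimate:.3f}")
print(f"population mean:    {statistics.fmean(true_values):.3f}")
```

In this sketch only `m` of the 5000 units need the costly primary annotation, while the cheap auxiliary labels cover the whole sample. Choosing the two sample sizes to minimize total cost is the question the paper's analysis addresses and is not attempted here.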
Publication date: 2014